Overwatch heroes Lucio, Wrecking Ball and Soldier 76 in action
Our dataset consists of information on competitive gamers who play the video game Overwatch on PlayStation 4. Overwatch is a team-based multiplayer first-person shooter developed and published by Blizzard Entertainment. Overwatch assigns players to two teams of six, with each player selecting from a roster of 30 characters, known as “heroes”. Each hero has a unique style of play and belongs to one of three general role categories (tank, damage, or support). Players on a team work together to secure and defend control points on a map or to escort a payload across the map in a limited amount of time.
Overwatch has a large online community and esports presence. A player’s skill in competitive games is calculated by a “secret” formula at Blizzard that produces a “skill rating”, or “SR” for short. SR ranges from 0 to 5,000; the higher the score, the better the player.
Within the community, SR is divided into tiers according to how high the rating is, ranging from Bronze to Grandmaster.
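The tier boundaries are not spelled out in the text. As a sketch, assuming the commonly cited community cutoffs (Bronze below 1,500, then a new tier every 500 SR, up to Grandmaster at 4,000 and above), base R's `cut()` can map a rating to its tier. The cutoffs and the helper name `sr_tier` are assumptions for illustration, not values taken from our data.

```r
# Hypothetical helper: map skill ratings to tiers.
# ASSUMPTION: cutoffs follow the commonly cited community convention.
sr_tier <- function(sr) {
  cut(sr,
      breaks = c(0, 1500, 2000, 2500, 3000, 3500, 4000, 5000),
      labels = c("Bronze", "Silver", "Gold", "Platinum",
                 "Diamond", "Master", "Grandmaster"),
      right = FALSE,          # intervals are [lower, upper)
      include.lowest = TRUE)  # so that 5000 itself falls in Grandmaster
}

# Minimum, mean and maximum SR reported in the summary of our data
as.character(sr_tier(c(1051, 2587, 4416)))
# "Bronze" "Platinum" "Grandmaster"
```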
We scraped a snapshot of the SR of PS4 players whose profiles were public on overwatchtracker.com (a snapshot because SR changes from game to game). We then scraped each player’s career statistics from ovrstat.com, an open-source API that returns conveniently JSON-formatted data.
Currently we have over two thousand player skill ratings and over two thousand predictor variables. However, our research question lets us restrict attention to a small subset of the predictors, around 60 or so.
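The dotted column names used later (for example `assists.healingDone`) are what one gets when a nested career-statistics record is flattened. A minimal base-R sketch, assuming the API response has already been parsed into a nested list; the record below is illustrative only and does not reproduce the actual ovrstat schema.

```r
# ASSUMPTION: a parsed JSON response represented as a nested list.
# Field names mirror columns in our data; the record itself is illustrative.
career <- list(
  assists = list(healingDone = 113872, offensiveAssists = 16),
  game    = list(gamesWon = 56, gamesLost = 55)
)

# unlist() joins nested names with ".", giving one flat named vector
flat <- unlist(career)
names(flat)

# One row per player; rows from many players can then be stacked with rbind()
player_row <- as.data.frame(as.list(flat))
```

Flattening this way keeps the category prefix (`assists`, `game`, and so on) in each column name, which makes it easy to select whole groups of statistics later.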
Overall we’re interested in the question: if a player wants to improve their SR, what should they focus on? Should they try to eliminate more opponents? Heal their teammates? Or play a certain character? We will look for answers with a predictive model of SR that uses career player statistics as predictors. The answers we find should help any player improve their SR more efficiently and begin climbing their way to Grandmaster!
The answers we find could be used by amateurs and pro Overwatch gamers alike. We think of our analysis as the start of something like “Moneyball” for Overwatch.
library(tidyverse)  # read_csv(), %>%, glimpse(), select_if()
df = read_csv('data/clean-data.csv')
relevel top hero - brian
glimpse - kai
df %>% glimpse()
## Observations: 2,316
## Variables: 61
## $ skill_rating <dbl> 2535, 3815, 2909, 1735, 2…
## $ assists.defensiveAssists <dbl> 208, 26, 5, 557, 412, 0, …
## $ assists.healingDone <dbl> 113872, 461978, 16315, 33…
## $ assists.offensiveAssists <dbl> 16, 683, 30, 303, 448, 8,…
## $ average.allDamageDoneAvgPer10Min <dbl> 10696, 11647, 10539, 7440…
## $ average.barrierDamageDoneAvgPer10Min <dbl> 2132, 4929, 2498, 2264, 3…
## $ average.deathsAvgPer10Min <dbl> 7.41, 6.47, 8.67, 6.89, 8…
## $ average.eliminationsAvgPer10Min <dbl> 22.31, 20.30, 20.87, 18.1…
## $ average.finalBlowsAvgPer10Min <dbl> 12.76, 11.25, 13.11, 6.18…
## $ average.healingDoneAvgPer10Min <dbl> 878.00, 4094.00, 500.00, …
## $ average.heroDamageDoneAvgPer10Min <dbl> 8223, 6490, 7773, 4977, 6…
## $ average.objectiveKillsAvgPer10Min <dbl> 9.18, 8.15, 6.62, 8.69, 8…
## $ average.objectiveTimeAvgPer10Min <dbl> 52, 78, 39, 76, 72, 45, 8…
## $ average.soloKillsAvgPer10Min <dbl> 3.53, 2.14, 3.34, 1.10, 2…
## $ average.timeSpentOnFireAvgPer10Min <dbl> 76, 76, 68, 69, 55, 154, …
## $ best.allDamageDoneMostInGame <dbl> 27906, 29403, 19787, 1565…
## $ best.barrierDamageDoneMostInGame <dbl> 12549, 14998, 6376, 7599,…
## $ best.defensiveAssistsMostInGame <dbl> 10, 20, 2, 29, 36, 0, 35,…
## $ best.eliminationsMostInGame <dbl> 60, 56, 40, 40, 52, 24, 6…
## $ best.environmentalKillsMostInGame <dbl> 1, 5, 2, 4, 2, 0, 2, 1, 2…
## $ best.finalBlowsMostInGame <dbl> 36, 30, 30, 17, 35, 12, 3…
## $ best.healingDoneMostInGame <dbl> 3354, 9320, 3072, 13869, …
## $ best.heroDamageDoneMostInGame <dbl> 20399, 17370, 14664, 1038…
## $ best.killsStreakBest <dbl> 60, 56, 40, 40, 52, 24, 6…
## $ best.meleeFinalBlowsMostInGame <dbl> 4, 5, 3, 1, 1, 3, 7, 5, 2…
## $ best.multikillsBest <dbl> 4, 3, 4, 3, 5, 3, 4, 4, 3…
## $ best.objectiveKillsMostInGame <dbl> 30, 35, 18, 23, 29, 9, 31…
## $ best.objectiveTimeMostInGame <dbl> 280, 377, 169, 257, 369, …
## $ best.offensiveAssistsMostInGame <dbl> 5, 46, 13, 27, 21, 8, 20,…
## $ best.soloKillsMostInGame <dbl> 36, 30, 30, 17, 35, 12, 3…
## $ best.teleporterPadsDestroyedMostInGame <dbl> 1, 1, 0, 0, 3, 0, 1, 1, 3…
## $ best.timeSpentOnFireMostInGame <dbl> 482, 353, 305, 307, 518, …
## $ best.turretsDestroyedMostInGame <dbl> 22, 11, 3, 11, 9, 0, 4, 1…
## $ combat.barrierDamageDone <dbl> 276462, 556147, 81535, 12…
## $ combat.damageDone <dbl> 1066469, 732220, 253681, …
## $ combat.deaths <dbl> 961, 730, 283, 368, 1064,…
## $ combat.eliminations <dbl> 2894, 2291, 681, 967, 228…
## $ combat.environmentalKills <dbl> 1, 32, 3, 9, 20, 0, 5, 2,…
## $ combat.finalBlows <dbl> 1655, 1269, 428, 330, 102…
## $ combat.heroDamageDone <dbl> 1066469, 732220, 253681, …
## $ combat.meleeFinalBlows <dbl> 33, 153, 4, 4, 2, 3, 29, …
## $ combat.multikills <dbl> 24, 11, 6, 4, 30, 1, 4, 1…
## $ combat.objectiveKills <dbl> 1190, 920, 216, 464, 1040…
## $ combat.objectiveTime <dbl> 6692, 8793, 1263, 4066, 9…
## $ combat.soloKills <dbl> 458, 241, 109, 59, 266, 3…
## $ combat.timeSpentOnFire <dbl> 9869, 8592, 2217, 3660, 6…
## $ game.gamesLost <dbl> 55, 59, 14, 21, 56, 0, 14…
## $ game.gamesTied <dbl> 3, 2, 1, 0, 3, 0, 0, 2, 0…
## $ game.gamesWon <dbl> 56, 41, 13, 29, 53, 1, 20…
## $ matchAwards.cards <dbl> 58, 26, 7, 27, 29, 1, 11,…
## $ matchAwards.medals <dbl> 354, 395, 68, 143, 269, 5…
## $ matchAwards.medalsBronze <dbl> 97, 143, 20, 41, 95, 2, 4…
## $ matchAwards.medalsGold <dbl> 169, 112, 33, 54, 83, 2, …
## $ matchAwards.medalsSilver <dbl> 88, 140, 15, 48, 91, 1, 4…
## $ miscellaneous.teleporterPadsDestroyed <dbl> 5, 4, 0, 0, 9, 0, 1, 1, 3…
## $ miscellaneous.turretsDestroyed <dbl> 134, 59, 16, 67, 74, 0, 2…
## $ assists.reconAssists <dbl> 2, 0, 1, 0, 4, 0, 0, 2, 0…
## $ best.reconAssistsMostInGame <dbl> 2, 0, 1, 0, 4, 0, 0, 2, 0…
## $ top_hero <fct> soldier76, roadhog, mccre…
## $ games_played <dbl> 114, 61, 79, 104, 193, 67…
## $ top_hero_type <chr> "damage", "tank", "damage…
Above is a density histogram (scaled so that the bar areas sum to 1) of the response we want to model, skill_rating. The distribution of skill_rating (blue bars) looks a lot like a normal distribution (the orange line). This is encouraging: a roughly symmetric, bell-shaped response makes it more plausible that the normality assumption on the errors of the linear regression model we will use to explain the variation in player skill can be met.
TODO: briefly mention hero types and what heroes are: https://en.wikipedia.org/wiki/Characters_of_Overwatch#Characters
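A plot like the one described can be produced along the following lines. This is a sketch on simulated ratings, not our data: the vector `sr` is drawn from a normal distribution whose mean and standard deviation are placeholders (assumptions), and `freq = FALSE` scales the bars so that their total area is 1.

```r
set.seed(42)
# ASSUMPTION: simulated stand-in for df$skill_rating
sr <- rnorm(2316, mean = 2587, sd = 500)

# freq = FALSE plots densities, so the bar areas sum to 1
h <- hist(sr, breaks = 30, freq = FALSE, col = "dodgerblue",
          main = "Skill rating", xlab = "skill_rating")

# Overlay a normal density with the sample mean and standard deviation
curve(dnorm(x, mean = mean(sr), sd = sd(sr)),
      add = TRUE, col = "darkorange", lwd = 2)

# Check: total area of the bars
sum(h$density * diff(h$breaks))
```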
# borrowed from week 8 HW
# Residual diagnostics for a fitted lm: two plots plus a Shapiro-Wilk test
diagnostics <- function(model,
                        pcol = 'grey',
                        lcol = 'dodgerblue',
                        alpha = 0.05,
                        plotit = TRUE,
                        testit = TRUE) {
  if (plotit) {
    par(mfrow = c(1, 2))
    # plot 1 - fitted vs resid
    plot(fitted(model), resid(model), col = pcol, pch = 20,
         xlab = "Fitted", ylab = "Residuals", main = "Residual versus fitted plot")
    abline(h = 0, col = lcol, lwd = 2)
    # plot 2 - normal Q-Q plot of the residuals
    qqnorm(resid(model), main = "Normal Q-Q Plot", col = pcol)
    qqline(resid(model), col = lcol, lwd = 2)
  }
  if (testit) {
    st <- shapiro.test(resid(model))
    # compare against the alpha argument rather than a hard-coded 0.05
    decision <- ifelse(st$p.value < alpha, 'Reject', 'Fail to Reject')
    return(list(p_val = st$p.value, decision = decision))
  }
}
df %>% summary
## skill_rating assists.defensiveAssists assists.healingDone
## Min. :1051 Min. : 0.0 Min. : 0
## 1st Qu.:2233 1st Qu.: 25.0 1st Qu.: 17430
## Median :2570 Median : 113.0 Median : 65304
## Mean :2587 Mean : 285.9 Mean : 154075
## 3rd Qu.:2917 3rd Qu.: 328.0 3rd Qu.: 184397
## Max. :4416 Max. :11406.0 Max. :6412326
##
## assists.offensiveAssists average.allDamageDoneAvgPer10Min
## Min. : 0.0 Min. : 0
## 1st Qu.: 23.0 1st Qu.: 6871
## Median : 84.0 Median : 8704
## Mean : 180.0 Mean : 8582
## 3rd Qu.: 222.2 3rd Qu.:10331
## Max. :2809.0 Max. :21925
##
## average.barrierDamageDoneAvgPer10Min average.deathsAvgPer10Min
## Min. : 0 Min. : 0.000
## 1st Qu.:1830 1st Qu.: 6.730
## Median :2552 Median : 7.460
## Mean :2640 Mean : 7.503
## 3rd Qu.:3348 3rd Qu.: 8.230
## Max. :9885 Max. :15.740
##
## average.eliminationsAvgPer10Min average.finalBlowsAvgPer10Min
## Min. : 0.00 Min. : 0.000
## 1st Qu.:14.88 1st Qu.: 5.107
## Median :17.30 Median : 6.920
## Mean :16.82 Mean : 7.135
## 3rd Qu.:19.41 3rd Qu.: 9.015
## Max. :35.36 Max. :26.020
##
## average.healingDoneAvgPer10Min average.heroDamageDoneAvgPer10Min
## Min. : 0 Min. : 0
## 1st Qu.: 1025 1st Qu.: 4644
## Median : 2213 Median : 5772
## Mean : 2872 Mean : 5676
## 3rd Qu.: 4184 3rd Qu.: 6734
## Max. :13199 Max. :13439
##
## average.objectiveKillsAvgPer10Min average.objectiveTimeAvgPer10Min
## Min. : 0.000 Min. : 0.00
## 1st Qu.: 6.268 1st Qu.: 60.00
## Median : 7.465 Median : 75.00
## Mean : 7.336 Mean : 76.86
## 3rd Qu.: 8.530 3rd Qu.: 93.00
## Max. :18.200 Max. :229.00
##
## average.soloKillsAvgPer10Min average.timeSpentOnFireAvgPer10Min
## Min. :0.000 Min. : 0.00
## 1st Qu.:0.640 1st Qu.: 35.00
## Median :1.140 Median : 53.00
## Mean :1.369 Mean : 54.66
## 3rd Qu.:1.900 3rd Qu.: 69.25
## Max. :8.970 Max. :355.00
##
## best.allDamageDoneMostInGame best.barrierDamageDoneMostInGame
## Min. : 0 Min. : 0
## 1st Qu.:16373 1st Qu.: 6004
## Median :22157 Median : 9410
## Mean :22232 Mean : 9601
## 3rd Qu.:27594 3rd Qu.:12682
## Max. :69929 Max. :40390
##
## best.defensiveAssistsMostInGame best.eliminationsMostInGame
## Min. : 0.00 Min. : 0.00
## 1st Qu.:13.00 1st Qu.:34.00
## Median :26.00 Median :42.00
## Mean :25.52 Mean :41.35
## 3rd Qu.:37.00 3rd Qu.:50.00
## Max. :85.00 Max. :91.00
##
## best.environmentalKillsMostInGame best.finalBlowsMostInGame
## Min. : 0.000 Min. : 0.0
## 1st Qu.: 0.000 1st Qu.:14.0
## Median : 1.000 Median :20.0
## Mean : 1.488 Mean :20.4
## 3rd Qu.: 2.000 3rd Qu.:26.0
## Max. :11.000 Max. :56.0
##
## best.healingDoneMostInGame best.heroDamageDoneMostInGame
## Min. : 0 Min. : 0
## 1st Qu.: 6339 1st Qu.:10599
## Median :11944 Median :13842
## Mean :11538 Mean :13968
## 3rd Qu.:16354 3rd Qu.:17333
## Max. :33089 Max. :35887
##
## best.killsStreakBest best.meleeFinalBlowsMostInGame best.multikillsBest
## Min. : 0.00 Min. :0.000 Min. :0.000
## 1st Qu.:34.00 1st Qu.:1.000 1st Qu.:3.000
## Median :42.00 Median :1.000 Median :4.000
## Mean :41.35 Mean :1.503 Mean :3.207
## 3rd Qu.:50.00 3rd Qu.:2.000 3rd Qu.:4.000
## Max. :91.00 Max. :8.000 Max. :6.000
##
## best.objectiveKillsMostInGame best.objectiveTimeMostInGame
## Min. : 0.00 Min. : 0.0
## 1st Qu.:17.00 1st Qu.:177.0
## Median :22.00 Median :259.0
## Mean :21.96 Mean :267.6
## 3rd Qu.:27.00 3rd Qu.:345.0
## Max. :56.00 Max. :776.0
##
## best.offensiveAssistsMostInGame best.soloKillsMostInGame
## Min. : 0.00 Min. : 0.0
## 1st Qu.: 9.00 1st Qu.:14.0
## Median :15.00 Median :20.0
## Mean :15.86 Mean :20.4
## 3rd Qu.:22.00 3rd Qu.:26.0
## Max. :59.00 Max. :56.0
##
## best.teleporterPadsDestroyedMostInGame best.timeSpentOnFireMostInGame
## Min. :0.0000 Min. : 0.0
## 1st Qu.:0.0000 1st Qu.:184.0
## Median :0.0000 Median :296.0
## Mean :0.7539 Mean :298.3
## 3rd Qu.:1.0000 3rd Qu.:410.0
## Max. :6.0000 Max. :938.0
##
## best.turretsDestroyedMostInGame combat.barrierDamageDone
## Min. : 0.000 Min. : 0
## 1st Qu.: 3.000 1st Qu.: 25613
## Median : 6.000 Median : 74947
## Mean : 6.633 Mean : 147924
## 3rd Qu.:10.000 3rd Qu.: 187504
## Max. :29.000 Max. :1943961
##
## combat.damageDone combat.deaths combat.eliminations
## Min. : 0 Min. : 0.0 Min. : 0.0
## 1st Qu.: 59714 1st Qu.: 86.0 1st Qu.: 186.0
## Median : 163050 Median : 226.0 Median : 515.0
## Mean : 320381 Mean : 404.9 Mean : 946.4
## 3rd Qu.: 404488 3rd Qu.: 529.0 3rd Qu.: 1204.0
## Max. :3774790 Max. :4501.0 Max. :11197.0
##
## combat.environmentalKills combat.finalBlows combat.heroDamageDone
## Min. : 0.000 Min. : 0.0 Min. : 0
## 1st Qu.: 0.000 1st Qu.: 69.0 1st Qu.: 59714
## Median : 2.000 Median : 201.0 Median : 163050
## Mean : 5.305 Mean : 406.8 Mean : 320381
## 3rd Qu.: 6.000 3rd Qu.: 492.0 3rd Qu.: 404488
## Max. :103.000 Max. :5924.0 Max. :3774790
##
## combat.meleeFinalBlows combat.multikills combat.objectiveKills
## Min. : 0.000 Min. : 0.00 Min. : 0.0
## 1st Qu.: 1.000 1st Qu.: 1.00 1st Qu.: 82.0
## Median : 3.000 Median : 5.00 Median : 223.0
## Mean : 9.567 Mean : 10.17 Mean : 411.5
## 3rd Qu.: 11.000 3rd Qu.: 12.00 3rd Qu.: 534.2
## Max. :232.000 Max. :144.00 Max. :6059.0
##
## combat.objectiveTime combat.soloKills combat.timeSpentOnFire
## Min. : 0.0 Min. : 0 Min. : 0
## 1st Qu.: 869.8 1st Qu.: 10 1st Qu.: 538
## Median : 2303.5 Median : 32 Median : 1574
## Mean : 4212.1 Mean : 78 Mean : 3152
## 3rd Qu.: 5661.2 3rd Qu.: 89 3rd Qu.: 4007
## Max. :57495.0 Max. :1674 Max. :46230
##
## game.gamesLost game.gamesTied game.gamesWon matchAwards.cards
## Min. : 0.0 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 5.0 1st Qu.: 0.00 1st Qu.: 5.00 1st Qu.: 3.00
## Median : 13.0 Median : 0.00 Median : 13.00 Median : 9.00
## Mean : 22.6 Mean : 1.12 Mean : 23.07 Mean : 16.26
## 3rd Qu.: 29.0 3rd Qu.: 2.00 3rd Qu.: 31.00 3rd Qu.: 21.00
## Max. :264.0 Max. :14.00 Max. :265.00 Max. :336.00
##
## matchAwards.medals matchAwards.medalsBronze matchAwards.medalsGold
## Min. : 0.0 Min. : 0.00 Min. : 0.00
## 1st Qu.: 26.0 1st Qu.: 8.00 1st Qu.: 8.00
## Median : 67.5 Median : 20.50 Median : 22.50
## Mean : 121.0 Mean : 38.09 Mean : 42.67
## 3rd Qu.: 155.0 3rd Qu.: 49.00 3rd Qu.: 53.00
## Max. :1477.0 Max. :392.00 Max. :728.00
##
## matchAwards.medalsSilver miscellaneous.teleporterPadsDestroyed
## Min. : 0.00 Min. : 0.000
## 1st Qu.: 8.75 1st Qu.: 0.000
## Median : 22.00 Median : 0.000
## Mean : 40.23 Mean : 1.724
## 3rd Qu.: 53.00 3rd Qu.: 2.000
## Max. :409.00 Max. :56.000
##
## miscellaneous.turretsDestroyed assists.reconAssists
## Min. : 0.00 Min. : 0.000
## 1st Qu.: 5.00 1st Qu.: 0.000
## Median : 18.00 Median : 0.000
## Mean : 37.48 Mean : 2.877
## 3rd Qu.: 46.00 3rd Qu.: 4.000
## Max. :746.00 Max. :35.000
##
## best.reconAssistsMostInGame top_hero games_played
## Min. : 0.0 reinhardt: 221 Min. : 11
## 1st Qu.: 0.0 moira : 201 1st Qu.: 39
## Median : 0.0 mercy : 155 Median : 72
## Mean : 2.3 lucio : 154 Mean :104
## 3rd Qu.: 4.0 orisa : 142 3rd Qu.:133
## Max. :30.0 dVa : 131 Max. :672
## (Other) :1312
## top_hero_type
## Length:2316
## Class :character
## Mode :character
##
##
##
##
fit_add_full = lm(skill_rating ~ . -top_hero_type, data = df)
diagnostics(fit_add_full, testit = FALSE)
fm_diag = diagnostics(fit_add_full, plotit = FALSE)
# TODO: write out the results of the Shapiro-Wilk test
library(lmtest)  # provides bptest()
fm_bp = bptest(fit_add_full)
# TODO: write out the results of the Breusch-Pagan test
fm_bp
##
## studentized Breusch-Pagan test
##
## data: fit_add_full
## BP = 163.24, df = 84, p-value = 5.079e-07
There are two major problems in the full additive model: heteroskedasticity (the Breusch-Pagan test above rejects constant variance, with a p-value of about 5.1e-07) and non-normal residuals. We could search for a better-specified model and apply transformations to the predictors. Instead, we will think more carefully about the predictors and hand-pick a smaller starting model based on exploratory data analysis and our knowledge of Overwatch.
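For readers unfamiliar with the test reported above, the studentized Breusch-Pagan statistic can be computed by regressing the squared residuals on the model's predictors and multiplying the resulting R² by the sample size. A minimal base-R sketch on simulated data follows; the data and variable names are made up for illustration, and this is not our full model.

```r
set.seed(1)
n <- 500
x <- runif(n, 1, 10)
# Simulated data whose error spread grows with x (heteroskedastic by design)
y <- 2 + 3 * x + rnorm(n, sd = x)
fit <- lm(y ~ x)

# Auxiliary regression of the squared residuals on the same predictor
aux <- lm(resid(fit)^2 ~ x)
bp_stat <- n * summary(aux)$r.squared   # studentized (Koenker) BP statistic
bp_df   <- length(coef(aux)) - 1        # one degree of freedom per predictor
bp_pval <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(statistic = bp_stat, df = bp_df, p.value = bp_pval)
```

A small p-value, as in this simulated case, indicates that the residual variance depends on the predictors.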
pairs plots response ~ 5 predictors - kai
correlation - brian
# Correlation of every numeric column with skill_rating
find_cor_sr <- function(data){
  M <- cor(data %>% select_if(is.numeric))
  M[row.names(M) == 'skill_rating', !(colnames(M) %in% c('rank', 'skill_rating'))]
}
linear_cors = find_cor_sr(df)
sort(linear_cors, decreasing=TRUE) %>% head(5)
## best.meleeFinalBlowsMostInGame best.offensiveAssistsMostInGame
## 0.2598209 0.2320754
## average.barrierDamageDoneAvgPer10Min average.allDamageDoneAvgPer10Min
## 0.2308899 0.2224033
## combat.meleeFinalBlows
## 0.2183293
sort(linear_cors, decreasing=FALSE) %>% head(5)
## average.objectiveTimeAvgPer10Min average.objectiveKillsAvgPer10Min
## -0.131716523 -0.115678491
## average.deathsAvgPer10Min average.soloKillsAvgPer10Min
## -0.030222118 -0.012667216
## best.objectiveTimeMostInGame
## 0.001759417
cor(df$combat.meleeFinalBlows, df$best.meleeFinalBlowsMostInGame)
## [1] 0.6548119
cor(df$average.allDamageDoneAvgPer10Min, df$average.barrierDamageDoneAvgPer10Min)
## [1] 0.8441916
cor(df$average.allDamageDoneAvgPer10Min, df$best.offensiveAssistsMostInGame)
## [1] -0.003758676
Explain best.meleeFinalBlowsMostInGame and why combat.meleeFinalBlows is redundant with it (the two have a correlation of about 0.65).
Explain average.allDamageDoneAvgPer10Min and why it’s more interpretable than average.barrierDamageDoneAvgPer10Min, which largely tracks a player’s overall damage done per 10 minutes (the two have a correlation of about 0.84).
Explain average.objectiveKillsAvgPer10Min and why it’s correlated with average.objectiveTimeAvgPer10Min but more actionable, since it’s more specific about what to do when a player is on the objective area.
And that’s it. Those are the variables we have that are most strongly linearly correlated with SR.
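The pairwise checks above can be generalized. As a sketch, the hypothetical helper `redundant_pairs` below lists every pair of numeric columns whose absolute correlation exceeds a cutoff; the 0.8 cutoff and the toy data are assumptions for illustration only.

```r
# Hypothetical helper: pairs of numeric columns with |cor| above `cutoff`
redundant_pairs <- function(data, cutoff = 0.8) {
  num <- data[sapply(data, is.numeric)]
  M <- cor(num)
  M[lower.tri(M, diag = TRUE)] <- NA   # keep each pair once, drop the diagonal
  idx <- which(abs(M) > cutoff, arr.ind = TRUE)
  data.frame(var1 = rownames(M)[idx[, "row"]],
             var2 = colnames(M)[idx[, "col"]],
             cor  = M[idx])
}

set.seed(7)
a <- rnorm(200)
toy <- data.frame(a = a,
                  b = a + rnorm(200, sd = 0.1),  # nearly a copy of a
                  c = rnorm(200))                # unrelated to a and b
redundant_pairs(toy)
```

On the toy data only the pair (a, b) is flagged. Applied to the numeric columns of our data, the same idea would surface candidate pairs from which to keep only the more interpretable variable.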
A player may need to get better to improve their “average” statistics laid out in the correlated variables above. But one thing any player can always do is play more. So we also want to consider the number of games played as a predictor of skill_rating as it’s both actionable and an obvious variable to control for, i.e. are the most skilled just those who have played the most?
cor(df$skill_rating, df$games_played)
## [1] 0.1143257
par(mfrow=c(1,2))
plot(skill_rating ~ games_played, data = df)
plot(skill_rating ~ log(games_played), data = df)
We can see that the natural log transform of games_played makes the positive relationship with skill_rating easier to see and pulls in the long tail of players who have played many more games than the median player. This should help reduce heteroskedasticity associated with this predictor in the linear model.
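The effect of the log transform on a long right tail can be illustrated with a small simulation. The sketch below uses made-up, right-skewed game counts (an assumption, not our data) and a simple moment-based skewness helper, `skew`, which is hypothetical and defined here only for illustration.

```r
set.seed(3)
# ASSUMPTION: simulated right-skewed stand-in for df$games_played
games <- round(rlnorm(1000, meanlog = 4.3, sdlog = 0.8)) + 11

# Hypothetical helper: simple moment-based skewness
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

# Skewness before and after taking logs
c(raw = skew(games), logged = skew(log(games)))

par(mfrow = c(1, 2))
hist(games, main = "games_played")
hist(log(games), main = "log(games_played)")
```

The logged values are much closer to symmetric, so a handful of very active players has less leverage on the fit.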
step bic + aic - kai
show r^2 - kai
2 anova tests - kai
library(readr)
clean_df <- read_csv("data/clean-data.csv")
## Parsed with column specification:
## cols(
## .default = col_double(),
## top_hero = col_character(),
## top_hero_type = col_character()
## )
## See spec(...) for full column specifications.
# Plot skill_rating against every other column, skipping the response itself.
# Columns 1:60 already include games_played, so it does not need appending.
predictors = setdiff(colnames(clean_df)[1:60], "skill_rating")
for (name in predictors) {
  plot(as.formula(paste("skill_rating ~", name)), data = clean_df)
}
diagnostics of final model - brian
2 anova tests - kai
So, what should a player who wants to improve their skill rating focus on?